Fix text_replace() silently corrupting source files containing \\?\ on Windows #462

Open
AidanShipperley wants to merge 1 commit into conda:main from AidanShipperley:fix/targeted-extended-path-replacement

AidanShipperley (Contributor) commented Feb 13, 2026

Summary

  • Replaces the blanket strip of \\?\ and //?/ byte sequences from all text data in text_replace() with a targeted replacement that only matches extended-length prefixes immediately followed by the placeholder path
  • Preserves legitimate source code referencing \\?\ (e.g., huggingface_hub/file_download.py) while still correctly handling dangling extended-length prefixes left after core.py normalization
  • Updates existing tests and adds regression tests for the corruption scenario

Root Cause

PR #432 (released in 0.9.0) fixed #398 by stripping \\?\ and //?/ from all text file data on Windows to clean up dangling extended-length path prefixes:

# Previous approach - strips all occurrences unconditionally
data = data.replace(b'\\\\?\\', b'')  # the four bytes \\?\
data = data.replace(b'//?/', b'')

While this addressed the original issue, it also corrupts any .py file that contains those byte sequences as legitimate source code (e.g., "\\\\?\\" in a Python string literal), causing a SyntaxError at import time.
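A minimal reproduction of the corruption, assuming the stripped sequence is the four-byte prefix \\?\ and using a sample line modeled on the huggingface_hub case. The names blanket_strip, compiles, and SOURCE are hypothetical stand-ins for illustration, not the project's code:

```python
def blanket_strip(data: bytes) -> bytes:
    """Mimic the previous behavior: remove every extended-length prefix."""
    data = data.replace(b'\\\\?\\', b'')  # the four bytes \\?\
    data = data.replace(b'//?/', b'')
    return data


def compiles(src: bytes) -> bool:
    """Return True if the bytes parse as Python source."""
    try:
        compile(src, "<sample>", "exec")
        return True
    except SyntaxError:
        return False


# A source line whose string literal spells the prefix (hypothetical sample).
SOURCE = rb'lock_path = "\\\\?\\" + path' + b'\n'

print(compiles(SOURCE))                 # valid before stripping
print(compiles(blanket_strip(SOURCE)))  # invalid after stripping
```

Removing the four bytes from the middle of the literal leaves an odd number of backslashes before the closing quote, so the quote becomes escaped and the string never terminates.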

Fix

Replace with targeted replacement that only strips extended-length prefixes when immediately followed by the placeholder:

# New approach - only matches prefix + placeholder
data = data.replace(b'\\\\?\\' + placeholder_bytes, new_prefix_bytes)
data = data.replace(b'//?/' + placeholder_bytes, new_prefix_bytes)

This handles all three path cases and leaves unrelated source code untouched:

  • \\?\C:\envs\myenv -> replaced cleanly (extended-prefix + placeholder as unit)
  • //?/C:\envs\myenv -> replaced cleanly (extended-prefix + placeholder as unit)
  • C:\envs\myenv -> standard replacement
  • Source code with \\?\ (not adjacent to placeholder) -> preserved
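The cases listed above can be checked with a small standalone sketch. It assumes the prefix-plus-placeholder replacements run first and the standard placeholder replacement runs afterwards, as the commit message describes. The function name text_replace_sketch, its signature, and the sample paths are hypothetical and do not mirror the project's actual text_replace():

```python
def text_replace_sketch(data: bytes, placeholder: bytes, new_prefix: bytes) -> bytes:
    """Illustrative only: targeted handling of extended-length prefixes."""
    # 1. Replace extended-length prefix + placeholder as a single unit.
    data = data.replace(b'\\\\?\\' + placeholder, new_prefix)  # \\?\ form
    data = data.replace(b'//?/' + placeholder, new_prefix)
    # 2. Standard replacement for any remaining bare placeholder.
    return data.replace(placeholder, new_prefix)


PLACEHOLDER = rb'C:\envs\myenv'   # hypothetical build-time prefix
NEW_PREFIX = rb'D:\deploy\env'    # hypothetical install-time prefix

samples = [
    rb'\\?\C:\envs\myenv\python.exe',   # extended-length, backslash form
    rb'//?/C:\envs\myenv\python.exe',   # extended-length, forward-slash form
    rb'C:\envs\myenv\python.exe',       # plain placeholder
    rb'lock_path = "\\\\?\\" + path',   # source code, not next to placeholder
]

for sample in samples:
    print(text_replace_sketch(sample, PLACEHOLDER, NEW_PREFIX))
```

Because the prefix is only matched when the placeholder follows it directly, the last sample passes through unchanged.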

Testing

Fixes #461

… Windows

Replace blanket strip of \\?\ and //?/ byte sequences from all text
data with targeted replacement that only matches extended-length
prefixes immediately followed by the placeholder path.

The previous approach (introduced in PR conda#432 for issue conda#398) would
unconditionally remove all \\?\ and //?/ occurrences from text files,
corrupting legitimate source code that references those patterns (e.g.,
huggingface_hub's file_download.py).

The new approach replaces extended-prefix + placeholder as a unit FIRST,
then does the standard placeholder replacement. This correctly handles:
- \\?\C:\envs\myenv -> replaced cleanly
- //?/C:\envs\myenv -> replaced cleanly
- C:\envs\myenv -> standard replacement
- Source code with \\?\ (not adjacent to placeholder) -> preserved

Fixes conda#461
@github-project-automation github-project-automation bot moved this to 🆕 New in 🔎 Review Feb 13, 2026
@conda-bot conda-bot added the cla-signed [bot] added once the contributor has signed the CLA label Feb 13, 2026
